# Image Classification Backbone
Focalnet Huge Fl4.ms In22k
MIT
FocalNet is an image classification model built on focal modulation networks, pretrained by the Microsoft team on the ImageNet-22k dataset.
Image Classification
Transformers

timm
103
0
Swinv2 Base Patch4 Window12to24 192to384 22kto1k Ft
Apache-2.0
Swin Transformer v2 is a vision Transformer model that handles image classification and dense recognition tasks efficiently through hierarchical feature maps and local window self-attention mechanisms.
Image Classification
Transformers

microsoft
1,824
0
Swinv2 Base Patch4 Window12 192 22k
Apache-2.0
Swin Transformer v2 is a vision Transformer model that achieves efficient image processing through hierarchical feature maps and local window self-attention mechanisms.
Image Classification
Transformers

microsoft
8,603
3
Swinv2 Small Patch4 Window16 256
Apache-2.0
Swin Transformer v2 is a vision Transformer model that achieves efficient image processing through hierarchical feature maps and local window self-attention mechanisms.
Image Classification
Transformers

microsoft
315
1
Swinv2 Tiny Patch4 Window16 256
Apache-2.0
Swin Transformer v2 is a vision Transformer model that achieves efficient image classification through hierarchical feature maps and local window self-attention mechanisms.
Image Classification
Transformers

microsoft
403.69k
5
Swin Small Patch4 Window7 224
Apache-2.0
Swin Transformer is a hierarchical window-based vision Transformer model designed for image classification tasks, with computational complexity linearly related to input image size.
Image Classification
Transformers

microsoft
2,028
1
Swin Tiny Patch4 Window7 224
Apache-2.0
Swin Transformer is a hierarchical vision Transformer that achieves linear computational complexity by computing self-attention within local windows, making it suitable for image classification tasks.
Image Classification
Transformers

microsoft
98.00k
42
Swin Base Patch4 Window7 224
Apache-2.0
Swin Transformer is a hierarchical vision Transformer based on shifted windows, suitable for image classification tasks.
Image Classification
Transformers

microsoft
281.49k
15
Swin Large Patch4 Window7 224
Apache-2.0
Swin Transformer is a hierarchical vision Transformer that achieves linear computational complexity by computing self-attention within local windows, making it suitable for image classification and dense recognition tasks.
Image Classification
Transformers

microsoft
2,079
1
Swin Large Patch4 Window7 224 In22k
Apache-2.0
Swin Transformer is a hierarchical vision Transformer based on shifted windows, pretrained on the ImageNet-22k dataset (also known as ImageNet-21k), suitable for image classification tasks.
Image Classification
Transformers

microsoft
387
2
Swin Base Patch4 Window12 384
Apache-2.0
Swin Transformer is a hierarchical vision Transformer based on shifted windows, designed for image classification tasks, with computational complexity that is linear in input image size.
Image Classification
Transformers

microsoft
1,421
4
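
Several of the Swin entries above describe window-based self-attention as having computational complexity linear in input image size. The following is a minimal sketch of that arithmetic, using the per-layer cost estimates given in the Swin Transformer paper. The function names are hypothetical, and the example values (patch size 4, window size 7, 224x224 input, channel width 96) are assumptions read off the "Patch4 Window7 224" naming and the Tiny variant; they are illustrative, not taken from this listing.

```python
def global_attention_cost(h: int, w: int, c: int) -> int:
    """Approximate cost of global self-attention on an h x w token grid.

    Uses the estimate 4*h*w*C^2 + 2*(h*w)^2*C, which grows
    quadratically with the number of tokens h*w.
    """
    tokens = h * w
    return 4 * tokens * c**2 + 2 * tokens**2 * c


def window_attention_cost(h: int, w: int, c: int, window: int) -> int:
    """Approximate cost of self-attention restricted to window x window blocks.

    Uses the estimate 4*h*w*C^2 + 2*M^2*h*w*C, which grows
    linearly with the number of tokens h*w when the window size M is fixed.
    """
    tokens = h * w
    return 4 * tokens * c**2 + 2 * window**2 * tokens * c


def token_grid(image_size: int, patch_size: int) -> int:
    """Side length of the token grid after non-overlapping patch embedding."""
    return image_size // patch_size


# Assumed example settings: 224x224 input, patch size 4, window 7, width 96.
side = token_grid(224, 4)          # 56 tokens per side
side_2x = token_grid(448, 4)       # 112 tokens per side at double resolution

window_growth = window_attention_cost(side_2x, side_2x, 96, 7) / window_attention_cost(side, side, 96, 7)
global_growth = global_attention_cost(side_2x, side_2x, 96) / global_attention_cost(side, side, 96)

print(f"windowed attention cost growth at 2x resolution: {window_growth:.2f}x")
print(f"global attention cost growth at 2x resolution:   {global_growth:.2f}x")
```

Doubling the image side quadruples the token count, so a cost that is linear in tokens should also quadruple, while the global-attention estimate grows faster because of its quadratic term.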